Memory Access Latency
Memory access latency is the delay between issuing a memory request (read or write) and the moment the data is available to the processor or other system component. It is measured in nanoseconds (ns) or CPU cycles, and it is critical in performance-sensitive applications such as databases, in-memory systems, and low-latency services.
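Because latency is quoted in both nanoseconds and cycles, it helps to convert between the two. The following is a minimal sketch that assumes a fixed clock rate (3 GHz here, chosen purely for illustration). Real CPUs change frequency dynamically, so the results are rough estimates. The function names are hypothetical.

```python
def ns_to_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert a latency in nanoseconds to CPU cycles at a fixed clock rate."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    # A clock of f GHz completes f cycles every nanosecond.
    return latency_ns * clock_ghz


def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a cycle count back to nanoseconds at a fixed clock rate."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return cycles / clock_ghz


# Assumed clock rate, for illustration only.
CLOCK_GHZ = 3.0
for label, latency_ns in [("cache hit (~1 ns)", 1.0), ("DRAM access (~100 ns)", 100.0)]:
    print(f"{label}: about {ns_to_cycles(latency_ns, CLOCK_GHZ):.0f} cycles")
```

At the assumed 3 GHz, a 100 ns DRAM access corresponds to roughly 300 cycles in which the core could otherwise have been doing useful work.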
Memory Hierarchy & Latency
Modern systems use a hierarchical memory model to balance speed, capacity, and cost.
| Memory Type | Latency (approx) | Size (typical) | Location |
|---|---|---|---|
| CPU Register | 0.25 ns | Bytes | On CPU |
| L1 Cache | 0.5 – 1 ns | ~32 KB | On CPU core |
| L2 Cache | 3 – 10 ns | ~256 KB | On CPU chip |
| L3 Cache | 10 – 30 ns | ~8 MB | Shared on chip |
| RAM (DRAM) | 50 – 100 ns | ~GBs | On motherboard |
| SSD Storage | 50 – 150 μs | ~TBs | PCIe/SATA device |
| HDD | 5 – 10 ms | ~TBs | SATA/SAS device |
Moving down the hierarchy, latency and capacity increase while cost per byte decreases.
Why Memory Latency Matters
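- The CPU is much faster than memory → A single DRAM access of ~100 ns can stall execution for hundreds of cycles.
- Memory-bound vs I/O-bound workloads → In both cases the processor waits for data instead of computing; higher latency means longer waits.
- Performance bottlenecks → Memory stalls cap throughput, especially in high-throughput systems.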
- Cache misses lead to expensive memory fetches → Cache-efficient code matters.
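The cost of cache misses can be estimated with the standard average memory access time (AMAT) formula, where each level contributes its hit time plus its miss rate times the cost of going one level further out. The sketch below is illustrative only: the hit times and miss rates are assumed values rather than measurements, and each miss rate is local (the fraction of accesses reaching that level that miss).

```python
def amat_ns(levels, memory_ns):
    """Average memory access time for a multi-level cache.

    levels: list of (hit_time_ns, local_miss_rate), ordered from L1 outward.
    memory_ns: latency of main memory, paid when the last cache level misses.
    """
    penalty = memory_ns
    # Work from the outermost cache back toward L1:
    # AMAT(level) = hit_time + miss_rate * AMAT(next level).
    for hit_time_ns, miss_rate in reversed(levels):
        if not 0.0 <= miss_rate <= 1.0:
            raise ValueError("miss_rate must be between 0 and 1")
        penalty = hit_time_ns + miss_rate * penalty
    return penalty


# Assumed example values, not measurements.
example_levels = [
    (1.0, 0.05),   # L1: 1 ns hit time, 5% of accesses miss
    (5.0, 0.20),   # L2: 5 ns hit time, 20% of L1 misses also miss here
    (20.0, 0.50),  # L3: 20 ns hit time, 50% of L2 misses also miss here
]
print(f"AMAT: {amat_ns(example_levels, memory_ns=100.0):.2f} ns")
```

With these assumed numbers the average works out to about 1.95 ns, close to the L1 hit time. Raising the L1 miss rate quickly pulls the average toward DRAM latency, which is why cache-efficient code matters.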
Optimization Techniques for Memory Latency
| Strategy | Description |
|---|---|
| Caching | Store frequently accessed data in faster memory |
| Prefetching | Predict and load future memory needs ahead of time |
| Memory locality | Improve access patterns (e.g., access arrays sequentially) |
| Data alignment | Align and pack data so related fields share a cache line and objects do not straddle line boundaries |
| Avoiding cache thrashing | Reduce conflicts in cache sets by designing access-friendly structures |
| NUMA-awareness | Place data close to the CPU core using it in NUMA systems |
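The effect of memory locality and cache thrashing can be illustrated with a toy model. The sketch below simulates a direct-mapped cache and counts misses for different access patterns. The cache geometry (64 lines of 64 bytes) is an assumption for illustration; real caches are usually set-associative and are helped by hardware prefetching, so the model shows the trend rather than real miss counts.

```python
def count_misses(addresses, num_lines=64, line_bytes=64):
    """Count misses in a toy direct-mapped cache for a sequence of byte addresses."""
    tags = [None] * num_lines  # tag currently stored in each cache slot
    misses = 0
    for addr in addresses:
        block = addr // line_bytes   # memory block containing this address
        index = block % num_lines    # the single slot this block can occupy
        tag = block // num_lines     # distinguishes blocks sharing a slot
        if tags[index] != tag:
            tags[index] = tag        # evict whatever was there and load the block
            misses += 1
    return misses


# Sequential scan of 1,024 four-byte integers: one miss per 64-byte line.
sequential = [i * 4 for i in range(1024)]
# Strided scan touching a new cache line on every access.
strided = [i * 64 for i in range(1024)]
# Two addresses exactly one cache size (4,096 bytes) apart map to the same slot.
thrashing = [0, 4096] * 100
# Two addresses in neighbouring lines do not conflict.
friendly = [0, 64] * 100

for name, pattern in [("sequential", sequential), ("strided", strided),
                      ("thrashing", thrashing), ("friendly", friendly)]:
    print(f"{name}: {count_misses(pattern)} misses in {len(pattern)} accesses")
```

Under these assumptions the sequential scan misses 64 times in 1,024 accesses, the strided scan misses on every access, and the two conflicting addresses evict each other on all 200 accesses, while the non-conflicting pair misses only twice.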
Example with In-Memory Database
Scenario: A real-time analytics service stores its data in memory (e.g., Redis or Memcached).
- Accessing hot data in CPU cache: ~1–5 ns (very fast)
- Accessing cold data in RAM: ~100 ns (roughly 20–100× slower)
- Accessing persisted data on SSD (fallback): ~100,000 ns = 100 μs (about 1,000× slower than RAM)
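To see how these tiers combine, the sketch below computes a request-weighted average latency. The latencies come from the scenario above (using 5 ns for a cache hit); the request fractions are hypothetical and would need to be measured for a real service.

```python
def expected_latency_ns(tiers):
    """Weighted average latency over tiers given as (fraction_of_requests, latency_ns)."""
    total_fraction = sum(fraction for fraction, _ in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("request fractions must sum to 1")
    return sum(fraction * latency_ns for fraction, latency_ns in tiers)


# Hypothetical request mix; latencies taken from the scenario.
tiers = [
    (0.900, 5.0),        # hot data served from CPU cache
    (0.099, 100.0),      # cold data served from RAM
    (0.001, 100_000.0),  # fallback to SSD
]
print(f"Expected latency: {expected_latency_ns(tiers):.1f} ns")
```

With this assumed mix the average is about 114.4 ns, of which 100 ns comes from the 0.1% of requests that fall back to SSD. Rare slow accesses can dominate average latency, so keeping the working set in memory matters more than shaving nanoseconds off the fast path.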